Text analysis example
Word text analysis
The text found on https://www.redfunnel.co.uk has been analysed. The results are shown below.
Table of words used
Below is the table of words and associated counts.
Number of characters per sentence/paragraph
Count of words - top 10
The top 10 words by number of occurrences in the text are shown below. There are 206 unique words in the text.
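As a rough illustration of how such a count can be produced, here is a minimal sketch. The tokenisation rule (lowercase runs of letters and apostrophes) and the sample sentence are assumptions made for the example, not the method or text used for this report.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return (number of unique words, the n most common (word, count) pairs)."""
    # Assumed tokenisation: lowercase, keep runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return len(counts), counts.most_common(n)


# Made-up sample text, for illustration only.
sample = "Ferry to the Isle of Wight. The ferry sails to the Isle daily."
unique, top = top_words(sample, n=3)
print(unique, top)
```

Stop words such as "the" dominate a raw count like this one, so a real analysis would usually filter them out before ranking.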
Wordcloud
A wordcloud of the text is provided below.
Comparison wordcloud
A wordcloud comparing the words in the text by sentiment is provided below.
TF-IDF of unigram (single) words
This graph shows the distribution of tf-idf (term frequency-inverse document frequency) scores for single words. Words that are repeated throughout the text receive a lower score, and the height of each bar is the number of words with that score.
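The score itself can be sketched as follows, using one common definition: term frequency is a word's count divided by the length of its document, and inverse document frequency is the natural log of the number of documents divided by the number of documents containing the word. Treating each sentence as a "document" and the sample tokens below are assumptions for illustration; the report does not state how the text was split.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns {(doc_index, term): tf-idf score}."""
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in docs for term in set(doc))
    scores = {}
    for i, doc in enumerate(docs):
        counts = Counter(doc)
        for term, count in counts.items():
            tf = count / len(doc)               # share of this document
            idf = math.log(n_docs / df[term])   # 0 when the term is in every document
            scores[(i, term)] = tf * idf
    return scores


# Made-up token lists, one per assumed "document".
docs = [["ferry", "isle", "ferry"], ["ferry", "travel"], ["ferry", "isle", "offers"]]
scores = tf_idf(docs)
```

In this sample "ferry" appears in every document, so its score is zero everywhere, while "travel" appears in only one and scores highest.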
Count of two word phrases - top 5
The top 5 two-word phrases by number of occurrences are shown below. There are 387 two-word phrases.
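A minimal sketch of counting two-word phrases, assuming the text has already been split into lowercase tokens. The helper name `ngrams` and the sample tokens are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Made-up sample tokens, for illustration only.
tokens = "isle of wight ferry to the isle of wight".split()
bigram_counts = Counter(ngrams(tokens, 2))
top_bigrams = bigram_counts.most_common(5)
```

The same helper gives three-word phrases with `ngrams(tokens, 3)`.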
Count of unique phrases by type
It may not look like it, but there are 1094 datapoints on this graph.
Of these:
Unigrams: 265 (24% of the total)
Bigrams: 423 (39% of the total)
Trigrams: 406 (37% of the total)
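The percentages above follow directly from the three counts, as this short check shows. Rounding to the nearest whole percent is assumed.

```python
# Counts of unique phrases by type, taken from the figures above.
counts = {"Unigram": 265, "Bigram": 423, "Trigram": 406}

total = sum(counts.values())
# Share of each type, rounded to the nearest whole percent.
shares = {kind: round(100 * n / total) for kind, n in counts.items()}

for kind, n in counts.items():
    print(f"{kind}: {n} which is {shares[kind]}% of {total}")
```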
Wordcloud - two word phrases
A wordcloud of the two-word phrases in the text is provided below.
Comparison two-phrase wordcloud
A wordcloud comparing the two-word phrases in the text by sentiment is provided below.
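TF-IDF of bigrams (two-word phrases)
This graph shows the distribution of tf-idf (term frequency-inverse document frequency) scores for two-word phrases. Phrases that are repeated throughout the text receive a lower score, and the height of each bar is the number of phrases with that score.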
Network graph of bigrams that occur more than once
This network graph shows the links between the words of two-word phrases that occur more than once in the text.
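The data behind such a graph can be sketched as a weighted edge list: each word is a node, and each two-word phrase seen at least twice is a directed edge from its first word to its second. The function name, the sample tokens and the adjacency-list layout are assumptions for illustration.

```python
from collections import Counter, defaultdict


def bigram_edges(tokens, min_count=2):
    """Return {(first_word, second_word): count} for pairs seen at least min_count times."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return {pair: count for pair, count in pairs.items() if count >= min_count}


# Made-up sample tokens, for illustration only.
tokens = "isle of wight ferry to the isle of wight".split()
edges = bigram_edges(tokens)

# Adjacency list: word -> list of (following word, number of occurrences).
graph = defaultdict(list)
for (first, second), weight in edges.items():
    graph[first].append((second, weight))
```

A plotting library can then draw `graph` with edge thickness proportional to the weight.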
Sentiment of text
The sentiment types found in the text are shown below. There are 3 sentiment types: Neutral, Positive and Negative.
Below is the table of text and associated sentiment.
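One simple way to assign these three labels is a lexicon lookup: count positive and negative words in each piece of text and compare. The sketch below assumes that approach. The two word lists are tiny hypothetical examples, and the report does not state which lexicon or method was actually used.

```python
import re

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"great", "easy", "fast"}
NEGATIVE = {"delay", "cancelled", "slow"}


def classify(sentence):
    """Label a sentence Positive, Negative or Neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


labels = [classify(s) for s in ["A fast, easy crossing", "Sailing cancelled", "Book a ferry"]]
```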
Topic modelling
The graph shows the topics estimated from the text, with the 10 highest-scoring words for each topic. 4 topics were chosen.
Top 3 sentences
The top 3 scoring (summary extraction) sentences are below.
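A minimal sketch of one frequency-based way to score sentences for an extractive summary: each sentence is scored by the average text-wide frequency of its words, and the highest-scoring sentences are kept. This scoring rule, the sentence splitter and the sample text are assumptions; the report does not say which summarisation method it used.

```python
import re
from collections import Counter


def tokenise(sentence):
    # Assumed tokenisation: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", sentence.lower())


def top_sentences(text, n=3):
    """Return the n sentences whose words are, on average, most frequent in the text."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in tokenise(s))

    def score(sentence):
        words = tokenise(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    return sorted(sentences, key=score, reverse=True)[:n]


# Made-up sample text, for illustration only.
text = "The ferry sails to the isle. Book the ferry today. Parking is available."
summary = top_sentences(text, n=2)
```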
Word associations for top 2 words
Word 1 - isle
Word 1 is "isle". The top 10 words associated with it are shown below.
Word 2 - wight
Word 2 is "wight". The top 10 words associated with it are shown below.
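One common way to measure how strongly two words are associated is the correlation between their presence in each document. The sketch below assumes that measure, treats each sentence as a document, and uses made-up data; the report does not state how its associations were calculated.

```python
import math


def association(docs, word_a, word_b):
    """Pearson correlation of two words' presence (1) or absence (0) across docs."""
    a = [1 if word_a in doc else 0 for doc in docs]
    b = [1 if word_b in doc else 0 for doc in docs]
    n = len(docs)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A word present in all or none of the documents has no defined correlation.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Made-up documents as sets of words, for illustration only.
docs = [
    {"isle", "of", "wight"},
    {"isle", "wight", "ferry"},
    {"ferry", "travel"},
    {"book", "travel"},
]
isle_wight = association(docs, "isle", "wight")
isle_ferry = association(docs, "isle", "ferry")
```

Ranking every other word by its score against "isle" and keeping the first 10 would give a list of the kind described above.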